I-vector speaker verification based on phonetic information under transmission channel effects
Abstract
Past studies have shown that the higher frequencies of the spectrum, which narrowband channels filter out, carry important speaker-specific content. Wideband transmissions, which are gaining ground over narrowband communications, offer an extended frequency range that yields not only better speech quality and intelligibility but also improved speaker recognition performance. In this work, different phoneme classes (fricatives, nasals, and vowels) were removed from speech of different bandwidths, and a series of i-vector-based speaker verification experiments was conducted. Our results show that the performance gain of clean wideband speech over clean narrowband speech is principally due to the presence of unvoiced fricative consonants. The effects of codec schemes of different bandwidths on this speech are also discussed.
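The phoneme-removal step described in the abstract can be illustrated with a minimal sketch. It assumes that frame-level phoneme labels are already available (for example, from a forced alignment) and that trials are scored with cosine similarity between i-vectors; the abstract specifies neither, so both are assumptions. The phoneme inventory and the names `PHONEME_CLASSES`, `remove_phoneme_class`, and `cosine_score` are hypothetical and chosen for illustration only.

```python
import math
from typing import Dict, List, Sequence, Set

# Hypothetical phoneme-class inventory (illustrative; not taken from the paper).
PHONEME_CLASSES: Dict[str, Set[str]] = {
    "unvoiced_fricative": {"f", "s", "sh", "th", "hh"},
    "nasal": {"m", "n", "ng"},
    "vowel": {"aa", "iy", "uw", "eh", "ah"},
}


def remove_phoneme_class(
    frames: Sequence[Sequence[float]],
    labels: Sequence[str],
    class_name: str,
) -> List[Sequence[float]]:
    """Keep only the feature frames whose label is outside the given class.

    Assumes one phoneme label per frame, e.g. from a forced alignment.
    """
    if len(frames) != len(labels):
        raise ValueError("frames and labels must have the same length")
    excluded = PHONEME_CLASSES[class_name]
    return [frame for frame, label in zip(frames, labels) if label not in excluded]


def cosine_score(enrol: Sequence[float], test: Sequence[float]) -> float:
    """Cosine similarity between two i-vectors (an assumed scoring back end)."""
    dot = sum(x * y for x, y in zip(enrol, test))
    norm_enrol = math.sqrt(sum(x * x for x in enrol))
    norm_test = math.sqrt(sum(y * y for y in test))
    if norm_enrol == 0.0 or norm_test == 0.0:
        raise ValueError("i-vectors must be non-zero")
    return dot / (norm_enrol * norm_test)


# Toy usage: drop the unvoiced fricative frames from a four-frame utterance.
toy_frames = [[0.1], [0.2], [0.3], [0.4]]
toy_labels = ["s", "aa", "m", "s"]
kept = remove_phoneme_class(toy_frames, toy_labels, "unvoiced_fricative")
```

In a full experiment, the retained frames would be passed to an i-vector extractor, and verification performance with and without each phoneme class would be compared for every bandwidth condition.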
Similar resources
DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances
We investigate how to improve the performance of DNN i-vector based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. A text-constrained verification, due to its smaller, limited vocabulary, can deliver better performance than a text-independent one for a short utterance. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...
Generalized I-vector Representation with Phonetic Tokenizations and Tandem Features for both Text Independent and Text Dependent Speaker Verification
This paper presents a generalized i-vector representation framework with phonetic tokenization and tandem features for text-independent as well as text-dependent speaker verification. In the conventional i-vector framework, the tokens for calculating the zero-order and first-order Baum-Welch statistics are Gaussian Mixture Model (GMM) components trained from acoustic-level MFCC features. Yet bes...
Template-matching for text-dependent speaker verification
In the last decade, i-vector and Joint Factor Analysis (JFA) approaches to speaker modeling have become ubiquitous in the area of automatic speaker recognition. Both of these techniques involve the computation of posterior probabilities, using either Gaussian Mixture Models (GMM) or Deep Neural Networks (DNN), as a prior step to estimating i-vectors or speaker factors. GMMs focus on implicitly ...
Speaker verification based on broad phonetic categories
In this work we present a speaker verification system based on 4 broad phonetic categories: vowels+diphthongs, fricatives, glides+nasals, and silence+stops. Using these categories separately, it is observed that vowels, diphthongs, and fricatives are the most important categories for speaker verification. This observation confirms the results from the analysis of speaker and channel variability...
Content Normalization for Text-independent Speaker Verification
In the past few years, Deep Neural Network (DNN) based i-vector Speaker Verification (SV) systems have been shown to provide state-of-the-art performance. However, error rates increase drastically for short-duration recordings. In this paper, we improve the i-vector approach for short utterances, (i) by using smoothed DNN posteriors for i-vector extraction, and (ii) by normalizing the content of the ...